Policy Gradient Reinforcement Learning for Uncertain Polytopic LPV Systems based on MHE-MPC


Abstract

In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (as exogenous signals with inexact bounds), where the Linear Time Invariant (LTI) models (vertices) captured by combinations of the scheduling parameters become wrong. We first propose to adopt a Moving Horizon Estimation (MHE) scheme to simultaneously estimate the convex combination vector and the unmeasured states based on the observations and the model matching error. To tackle the wrong LTI models used in both the MPC and MHE schemes, we then adopt Policy Gradient (PG) Reinforcement Learning (RL) to learn both the estimator and the controller so that the best closed-loop performance is achieved. The effectiveness of the proposed RL-based MHE/MPC design is demonstrated using an illustrative example.
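The polytopic structure mentioned in the abstract can be illustrated with a minimal sketch: the model at any instant is a convex combination of LTI vertex models, weighted by a vector that lies in the unit simplex (the quantity the MHE scheme estimates). The vertex matrices, the function names `blend` and `lpv_step`, and all numerical values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Illustrative sketch only: vertex matrices and values are made up,
# not taken from the paper.

def blend(vertices, alpha):
    """Return the convex combination sum_i alpha_i * M_i of equally sized matrices."""
    if len(vertices) != len(alpha):
        raise ValueError("need one weight per vertex")
    if any(a < 0 for a in alpha) or abs(sum(alpha) - 1.0) > 1e-9:
        raise ValueError("alpha must lie in the unit simplex")
    rows, cols = len(vertices[0]), len(vertices[0][0])
    return [[sum(a * M[r][c] for a, M in zip(alpha, vertices))
             for c in range(cols)] for r in range(rows)]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lpv_step(A_vertices, B_vertices, alpha, x, u):
    """One step x+ = A(alpha) x + B(alpha) u of a polytopic LPV model."""
    A = blend(A_vertices, alpha)
    B = blend(B_vertices, alpha)
    return [ax + bu for ax, bu in zip(matvec(A, x), matvec(B, u))]


# Two hypothetical LTI vertices of a two-state, one-input system.
A_VERTICES = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
B_VERTICES = [[[1.0], [0.0]], [[0.0], [1.0]]]

# Equal weighting of the two vertices.
x_next = lpv_step(A_VERTICES, B_VERTICES, [0.5, 0.5], [2.0, 4.0], [1.0])
print(x_next)
```

If the vertices themselves are inaccurate, no choice of the weight vector recovers the true dynamics, which is the situation the abstract addresses by tuning the MHE and MPC schemes with policy gradient RL.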


Similar resources

Model-based Policy Gradient Reinforcement Learning

Policy gradient methods based on REINFORCE are model-free in the sense that they estimate the gradient using only online experiences executing the current stochastic policy. This is extremely wasteful of training data as well as being computationally inefficient. This paper presents a new model-based policy gradient algorithm that uses training experiences much more efficiently. Our approach con...


Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques. On the other hand, on-policy algorithms are often more stable and easier to use. This paper examines, both theoretically and empirically, approaches to merging on- and off-policy updates for deep reinforcement learning. Theoretical resu...


A Polyhedral Off-Line Robust MPC Strategy for Uncertain Polytopic Discrete-Time Systems

In this paper, an off-line synthesis approach to robust constrained model predictive control for uncertain polytopic discrete-time systems is presented. Most of the computational burdens are moved off-line by pre-computing a sequence of state feedback control laws that corresponds to a sequence of polyhedral invariant sets. The state feedback control laws computed are derived by minimizing the ...


Robot reinforcement learning accuracy-based learning classifier systems with Fuzzy Policy Gradient descent (XCS-FPGRL)

This paper presents a novel approach, XCS-FPGRL, to robot reinforcement learning. XCS-FPGRL combines a covering operator and a genetic algorithm. The system is responsible for adjusting precision and reducing the search space according to the reward obtained from the environment, and acts as an innovation discovery component which is responsible for discovering new better reinforcement learnin...


Scalable Multitask Policy Gradient Reinforcement Learning

Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach, which aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer f...



Journal

Journal title: IFAC-PapersOnLine

Year: 2022

ISSN: 2405-8963, 2405-8971

DOI: https://doi.org/10.1016/j.ifacol.2022.07.599